---
title: "Flexdashboard: Interactive & Animated Plotting"
output:
  flexdashboard::flex_dashboard:
    vertical_layout: fill
    theme: yeti
    source_code: embed
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
```
```{r loadPackages, include=FALSE}
# install.packages("gifski") # Make sure the gifski package is installed as well. It creates a gif out of the frames gganimate produces
library(ggplot2)
library(plotly)
library(tidyr)
library(dplyr)
library(gganimate)
library(gridExtra)
library(cowplot)
library(COVID19) # To download data
```
```{r getData, message=FALSE, warning=FALSE}
# Get COVID19 data from Belgium at national level
data1 <- covid19(country=c("Belgium"),start="2020-03-01", verbose=FALSE)
data1 <- data1[data1$date<=(Sys.Date()-2),] # Remove last 2 days, may still be incomplete
# Add daily confirmed cases: the first day keeps its cumulative value, later days get the day-to-day increase
data1 <- data1 %>%
  arrange(date) %>%
  group_by(administrative_area_level_1) %>%
  mutate(confirmed_daily = c(confirmed[1], diff(confirmed))) %>%
  ungroup()
# Get COVID19 data from Belgium at regional level
data2 <- covid19(country=c("Belgium"),start="2020-03-01",level=2, verbose=FALSE)
data2 <- data2[data2$date<=(Sys.Date()-2),] # Remove last 2 days, may still be incomplete
data2 <- data2[data2$administrative_area_level_2!="Ostbelgien",] # We remove the Ostbelgien entries since they do not contain the number of confirmed cases or number of hospitalisations
# Get COVID19 data from multiple countries at national level
data3 <- covid19(country=c("Belgium","CZE","Netherlands","France","Germany","United Kingdom"),start="2020-03-01", verbose=FALSE)
data3 <- data3[data3$date<=(Sys.Date()-2),] # Remove last 2 days, may still be incomplete
# In case the code above fails to download the data, please load a pre-downloaded version here:
# load(".../data/covid19_belgium.RData")
```
Covid-19 Data
======================
Column
--------------
### Data {.no-title data-height=500}
<p align="center">
[{width=200}](https://covid19datahub.io/)
</p>
World-wide Covid-19 data will be downloaded using the `COVID19` R package. This package downloads COVID-19 data across governmental sources at national, regional, and city level, as described in Guidotti and Ardia (2020) (doi:10.21105/joss.02376). It also includes policy measures from the 'Oxford COVID-19 Government Response Tracker' (https://www.bsg.ox.ac.uk/research/research-projects/coronavirus-governmentresponse-tracker).
For more info on this unified dataset, visit their data hub (https://covid19datahub.io/).
### Info {data-height=500}
The covid19 datasets contain many variables, including:

* confirmed cases (`confirmed`)
* number of deaths (`deaths`)
* number of hospitalized patients (`hosp`)
* the level at which the numbers were recorded
    + `administrative_area_level_1` for `data1`
    + `administrative_area_level_2` for `data2`
* population size of `administrative_area_level_1` or `administrative_area_level_2` (`population`)
* numerous restrictions (`school_closing`, `cancel_events`, `gatherings_restrictions`, ...)
    + more info on the meaning of the restriction levels can be found at https://github.com/OxCGRT/covid-policy-tracker/blob/master/documentation/codebook.md#containment-and-closure-policies
* number of recovered cases (`recovered`)
* ...
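
Count columns such as `confirmed` are cumulative, which is why the setup chunk derives `confirmed_daily` with `diff()`. The sketch below shows that step on made-up numbers, using base R only so it runs without extra packages. The data frame `toy` and its `area` column are hypothetical stand-ins for the downloaded data.

```r
# Toy cumulative counts for two made-up areas (not real data)
toy <- data.frame(
  area      = rep(c("A", "B"), each = 5),
  date      = rep(seq(as.Date("2020-03-01"), by = "day", length.out = 5), times = 2),
  confirmed = c(6, 10, 12, 17, 19,   # area A, cumulative
                2,  2,  5,  9,  9)   # area B, cumulative
)

# Sort by area and date so that diff() compares consecutive days
toy <- toy[order(toy$area, toy$date), ]

# Within each area: the first day keeps its cumulative value,
# every later day gets the increase since the previous day
toy$confirmed_daily <- ave(
  toy$confirmed, toy$area,
  FUN = function(x) c(x[1], diff(x))
)

toy
```

Summing `confirmed_daily` within an area gives back the cumulative column, which is a quick sanity check for this kind of transformation.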
Column
--------------------
### Structure {data-height=800}
```{r showData}
str(data2)
```
Exercise 1 {data-navmenu="Exercises ggplot" data-orientation=cols}
================================================================================================
Overview {.sidebar data-width=400}
-------------------------------------
### Overview
* **Data:** `data1`
* **Variables of interest:** `date`, `confirmed_daily`, `administrative_area_level_1`
* **Functions/tips**: `geom_line`, `scale_x_date`,`geom_smooth`
Column {.tabset .tabset-fade}
--------------------------
### Question & Code
Use `data1` to visualize the daily confirmed cases in Belgium over time with a *colored* line. Also make sure all months (with year) appear on the x-axis and give the graph a title.
```{r exercise1_solution1, echo=TRUE,eval=FALSE}
# Enter your solution here
ggplot(data1, aes(x=date, y=confirmed_daily, col=administrative_area_level_1)) +
  geom_line() +
  scale_x_date(date_breaks="1 month", date_labels="%B %Y") +
  theme_bw() + labs(title="Daily confirmed cases in Belgium") +
  theme(axis.text.x = element_text(angle=60, hjust=1))
```
You will see that the line of confirmed daily cases jumps up and down consistently due to a *weekend effect*. Add a smooth line through these data points instead. (**Tip:** a span of 0.2 for *loess* gives a good result)
```{r exercise1_solution2,eval=FALSE, echo=TRUE}
# Enter your solution here
ggplot(data1, aes(x=date, y=confirmed_daily, col=administrative_area_level_1)) +
  geom_line() +
  scale_x_date(date_breaks="1 month", date_labels="%B %Y") +
  theme_bw() + labs(title="Daily confirmed cases in Belgium") +
  theme(axis.text.x = element_text(angle=60, hjust=1)) +
  geom_smooth(span=0.2, se=FALSE)
```
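
To see what the `span` does independently of ggplot2, here is a small base-R sketch with `loess()` on made-up counts that have an artificial weekend dip. The variable names (`day`, `trend`, `cases`) and all numbers are assumptions for illustration; `geom_smooth()` may differ in details such as default settings.

```r
# Made-up daily counts: a smooth trend with halved reporting on two days per week
day     <- 1:120
trend   <- 1000 + 500 * sin(day / 20)
weekend <- (day %% 7) %in% c(0, 6)        # two "weekend" days in every block of seven
cases   <- ifelse(weekend, 0.5 * trend, trend)

# Local regression: span = 0.2 means each local fit uses about 20% of the points
fit      <- loess(cases ~ day, span = 0.2)
smoothed <- predict(fit)                  # fitted values at the observed days

# Day-to-day jumps before and after smoothing
c(raw = sd(diff(cases)), smoothed = sd(diff(smoothed)))
```

A smaller span follows the weekly pattern more closely, while a larger span flattens it further. The value 0.2 from the tip is a compromise for this kind of series.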
Finally, use your previous code to make 2 plots: one with the line through the raw data and one with the smoothed line. Combine these plots into a single plot grid using either the `gridExtra` or the `cowplot` package.
### Plot
```{r exercise1_solution3, echo=FALSE,fig.width=14, eval=TRUE}
# Enter your solution here
p1 <- ggplot(data1, aes(x=date, y=confirmed_daily, col=administrative_area_level_1)) +
  geom_line() +
  scale_x_date(date_breaks="1 month", date_labels="%B %Y") +
  theme_bw() + labs(title="Daily confirmed cases in Belgium") +
  theme(axis.text.x = element_text(angle=60, hjust=1))
p2 <- ggplot(data1, aes(x=date, y=confirmed_daily, col=administrative_area_level_1)) +
  scale_x_date(date_breaks="1 month", date_labels="%B %Y") +
  theme_bw() + labs(title="Daily confirmed cases in Belgium") +
  theme(axis.text.x = element_text(angle=60, hjust=1)) +
  geom_smooth(span=0.2, se=FALSE)
# gridExtra solution
grid.arrange(p1,p2,ncol=2)
```
Exercise 2 {data-navmenu="Exercises ggplot" data-orientation=rows}
===============================================
Row {data-height=150}
---------------------
### Overview
* **Data:** `data2`
* **Variables of interest:** `date`, `hosp`, `administrative_area_level_2`
* **Functions/tips**: `geom_line`, `facet_wrap`, `scale_color_manual`
Row
------------
### Questions & Code
For this exercise and Exercises 3 and 4 you will be using `data2`. Visualize the number of hospitalizations over time in the 3 main regions of Belgium. Make sure the 3 regions are separated into 3 facets and give each line (one per region) a manual color! (pick your favorite colors)
```{r exercise2_solution1, fig.width=10,echo=TRUE,eval=FALSE}
# Enter your solution here
ggplot(data2, aes(x=date, y=hosp, col=administrative_area_level_2)) +
  geom_line(size=1) + facet_wrap(~administrative_area_level_2) +
  theme_bw() + labs(y="Number of Hospitalizations") +
  scale_color_manual(values=c("black","gold","red"))
```
Change your coloring variable to `gatherings_restrictions`. Make sure discrete colors are used and not a gradient! Also make sure that you obtain a single line with multiple colors and not multiple lines with a single color.
If you are interested in what the different restriction levels mean, check out https://github.com/OxCGRT/covid-policy-tracker/blob/master/documentation/codebook.md#containment-and-closure-policies .
(**Tip:** `group=1`)
```{r exercise2_solution2, fig.width=10,echo=TRUE,eval=FALSE}
# Enter your solution here
ggplot(data2, aes(x=date, y=hosp, col=factor(gatherings_restrictions))) +
  geom_line(size=1, aes(group=1)) + facet_wrap(~administrative_area_level_2) +
  theme_bw() + labs(y="Number of Hospitalizations") +
  theme(axis.text.x = element_text(angle=60, hjust=1))
```
Row
----------
### Plot 1 {data-padding=10}
```{r exercise2_solution1A, fig.width=10,echo=FALSE,eval=TRUE}
# Enter your solution here
ggplot(data2, aes(x=date, y=hosp, col=administrative_area_level_2)) +
  geom_line(size=1) + facet_wrap(~administrative_area_level_2) +
  theme_bw() + labs(y="Number of Hospitalizations") +
  scale_color_manual(values=c("black","gold","red"))
```
### Plot 2 {data-padding=10}
```{r exercise2_solution2A, fig.width=10,echo=FALSE,eval=TRUE}
# Enter your solution here
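# Color by gatherings_restrictions so the rendered plot matches the code shown in the question
ggplot(data2, aes(x=date, y=hosp, col=factor(gatherings_restrictions))) +
  geom_line(size=1, aes(group=1)) + facet_wrap(~administrative_area_level_2) +
  theme_bw() + labs(y="Number of Hospitalizations") +
  theme(axis.text.x = element_text(angle=60, hjust=1))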
```
Exercises plotly
=========================
### Exercise 3
For this visualization first recreate a similar plot as in exercise 2 (you should be able to recycle most of your earlier code). However this time, color by region, do not use facets, and make sure the y-axis shows number of hospitalizations over the regional population size. The latter makes the values more comparable between regions.
* **Data:** `data2`
* **Variables of interest:** `date`, `hosp`, `administrative_area_level_2`, `population`
* **Functions/tips**: `geom_line`, `ggplotly`
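
The normalisation asked for above can be sketched on made-up numbers. The data frame `regions` and its values are hypothetical; the point is only that two regions with very different raw counts can have the same rate.

```r
# Made-up regions: same hospitalisation rate, very different sizes
regions <- data.frame(
  region     = c("Large", "Small"),
  hosp       = c(600, 120),
  population = c(6000000, 1200000)
)

regions$hosp_rate     <- regions$hosp / regions$population  # ratio used on the y-axis
regions$hosp_per_100k <- regions$hosp_rate * 1e5            # same ratio, easier to read

regions
```

Inside `aes()` the same ratio can be written directly as `y = hosp / population`, so no extra column is strictly needed.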
Now transform your ggplot into an interactive plot and add the following additional tooltips: (1) total number of hospitalizations, (2) regional population size, (3) any other variables that are of interest to you (e.g. restrictions).
(**Tip:** `group=administrative_area_level_2`)
```{r exercise3_solution2, fig.width=10, echo=TRUE, eval=FALSE}
# Enter your solution here
p <- ggplot(data2, aes(x=date, y=hosp/population, col=administrative_area_level_2,
                       text=paste0("Total Hospitalized: ", hosp,
                                   "\nPopulation: ", population,
                                   "\nSchool Closing: ", school_closing,
                                   "\nGathering Restriction: ", gatherings_restrictions))) +
  geom_line(size=1, aes(group=administrative_area_level_2)) + # Plotly needs to know what each line represents
  theme_bw() +
  labs(y="Number of Hospitalizations / Regional Population Size", title="Hospitalization in Belgium") +
  scale_x_date(date_breaks="1 month", date_labels="%B %Y") +
  theme(axis.text.x = element_text(angle=60, hjust=1)) +
  scale_color_manual(values=c("black","gold","red"))
ggplotly(p)
```
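
The `text` aesthetic is just a character vector with one entry per row, so it can be built and inspected outside the plot first. A minimal sketch with made-up rows follows; the data frame `toy` is hypothetical and only mimics three of the columns of `data2`.

```r
# Two made-up rows with integer columns, as in the downloaded data
toy <- data.frame(
  hosp                    = c(40L, 55L),
  population              = c(1150000L, 1150000L),
  gatherings_restrictions = c(0L, 3L)
)

# paste0() is vectorised: one tooltip string per row, "\n" separates the lines
tooltip <- paste0(
  "Total Hospitalized: ", toy$hosp,
  "\nPopulation: ", toy$population,
  "\nGathering Restriction: ", toy$gatherings_restrictions
)

cat(tooltip[1])
```

If only this custom text should appear when hovering, `ggplotly(p, tooltip = "text")` restricts the tooltip to the `text` aesthetic; by default the other mapped aesthetics are shown as well.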
### Plot
```{r exercise3_solution2A, fig.width=10, echo=FALSE, eval=TRUE}
# Enter your solution here
p <- ggplot(data2, aes(x=date, y=hosp/population, col=administrative_area_level_2,
                       text=paste0("Total Hospitalized: ", hosp,
                                   "\nPopulation: ", population,
                                   "\nSchool Closing: ", school_closing,
                                   "\nGathering Restriction: ", gatherings_restrictions))) +
  geom_line(size=1, aes(group=administrative_area_level_2)) + # Plotly needs to know what each line represents
  theme_bw() +
  labs(y="Number of Hospitalizations / Regional Population Size", title="Hospitalization in Belgium") +
  scale_x_date(date_breaks="1 month", date_labels="%B %Y") +
  theme(axis.text.x = element_text(angle=60, hjust=1)) +
  scale_color_manual(values=c("black","gold","red"))
ggplotly(p)
```
Exercises gganimate {data-orientation=rows}
========================
### Exercise 4
Use the static plot of exercise 3 and animate it however you see fit following one of the approaches in the course slides. You do not need to stick to the line plot.
* **Data:** `data2`
* **Variables of interest:** `date`, `hosp`, `administrative_area_level_2`, `population`
* **Functions/tips**: `geom_line`, `geom_point`, `transition_reveal`, `transition_time`, `transition_states`
```{r exercise4_solution1, fig.width=10, warning=FALSE, echo=TRUE, eval=FALSE}
# Enter your solution here
p1 <- ggplot(data2, aes(x=date, y=hosp/population, col=administrative_area_level_2)) +
  geom_line(size=1) + geom_point(size=2) +
  theme_bw() +
  labs(y="Number of Hospitalizations / Regional Population Size", title="Hospitalization in Belgium: {frame_along}") +
  scale_color_manual(values=c("black","gold","red")) +
  scale_x_date(date_breaks="1 month", date_labels="%B %Y") +
  theme(axis.text.x = element_text(angle=60, hjust=1)) +
  transition_reveal(date)
animate(p1, width=800, height=400)
# Note: We could have used the ggplot statement directly without saving it in `p1` and calling `animate()`. However, doing it in 2 steps gives us more control over animation options such as width, height, number of frames, rewind, fps, ...
```
### Plot
```{r exercise4_solution1A, fig.width=10, warning=FALSE, echo=FALSE, eval=TRUE}
# Enter your solution here
p1 <- ggplot(data2, aes(x=date, y=hosp/population, col=administrative_area_level_2)) +
  geom_line(size=1) + geom_point(size=2) +
  theme_bw() +
  labs(y="Number of Hospitalizations / Regional Population Size", title="Hospitalization in Belgium: {frame_along}") +
  scale_color_manual(values=c("black","gold","red")) +
  scale_x_date(date_breaks="1 month", date_labels="%B %Y") +
  theme(axis.text.x = element_text(angle=60, hjust=1)) +
  transition_reveal(date)
animate(p1, width=800, height=400)
# Note: We could have used the ggplot statement directly without saving it in `p1` and calling `animate()`. However, doing it in 2 steps gives us more control over animation options such as width, height, number of frames, rewind, fps, ...
```
Bonus Exercise {data-orientation=cols}
=================
Column {data-width=200}
------------
### Question
From scratch, use `data3` to create a plot that shows the **deaths per 100.000 people for every week** over time for all of the included countries (Belgium, Czech Republic, France, Germany, Netherlands, United Kingdom). Either make this plot interactive by adding tooltips or animate with your favorite animation.
You will need to do some data manipulation to add the daily deaths and the weekly deaths per 100.000. If you get stuck here, don't be afraid to head over to your best friends Google and Stack Overflow to find an easy or creative solution.
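
As a hint for the weekly step, here is a base-R sketch on made-up daily deaths (the daily values themselves can be derived from the cumulative `deaths` column in the same way `confirmed_daily` is derived in the setup chunk). The data frame `toy`, the `country` column and the choice of Monday as the start of the week are assumptions for illustration; a lubridate-based solution may use a different week start.

```r
# Made-up daily deaths for two countries over two full weeks (2020-03-02 is a Monday)
dates <- seq(as.Date("2020-03-02"), by = "day", length.out = 14)
toy <- data.frame(
  country      = rep(c("A", "B"), each = 14),
  date         = rep(dates, times = 2),
  deaths_daily = rep(c(1, 2), each = 14),
  population   = rep(c(700000, 2800000), each = 14)
)

# Monday of the week each date falls in ("%u" is the ISO weekday, Monday = 1)
toy$week_start <- toy$date - (as.integer(format(toy$date, "%u")) - 1L)

# One row per country and week, then scale to deaths per 100.000 inhabitants
weekly <- aggregate(deaths_daily ~ country + population + week_start,
                    data = toy, FUN = sum)
weekly$deaths_per_100k <- weekly$deaths_daily / weekly$population * 1e5

weekly
```

Aggregating to one row per week keeps the line plot clean, but it drops the daily columns. If those are still wanted in a tooltip, the weekly sum can instead be added to every daily row.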
Column
---------------------
### Code
```{r exercisebonus_solution, message=FALSE, warning=FALSE, fig.width=10,echo=TRUE,eval=FALSE}
# Enter your solution here
# Add daily deaths (the same way daily confirmed cases were added above)
data3 <- data3 %>%
  arrange(date) %>%
  group_by(administrative_area_level_1) %>%
  mutate(deaths_daily = c(deaths[1], diff(deaths))) %>%
  ungroup()
# Add a date_week variable holding the first day of every week. We use floor_date() from the lubridate package for this
# (round_date() would snap to the nearest week boundary instead of the start of the week).
# Then, we add the weekly sum of the daily deaths to every row of that week
data3 <- data3 %>%
  mutate(date_week = lubridate::floor_date(date, "week")) %>%
  group_by(administrative_area_level_1, date_week) %>%
  mutate(deaths_week = sum(deaths_daily)) %>%
  ungroup()
p <- ggplot(data3, aes(x=date_week, y=deaths_week/population*100000, col=administrative_area_level_1,
                       text=paste0("confirmed: ", confirmed, "\ndeaths: ", deaths))) +
  geom_line(size=1, aes(group=administrative_area_level_1)) +
  scale_x_date(date_breaks="1 month", date_labels="%B %Y") +
  theme_bw() + labs(title="Weekly deaths per 100.000", y="Weekly deaths per 100.000", x="") +
  theme(axis.text.x = element_text(angle=60, hjust=1))
ggplotly(p)
```
### Plot
```{r exercisebonus_solutionA, message=FALSE, warning=FALSE, fig.width=10}
# Enter your solution here
# Add daily deaths (the same way daily confirmed cases were added above)
data3 <- data3 %>%
  arrange(date) %>%
  group_by(administrative_area_level_1) %>%
  mutate(deaths_daily = c(deaths[1], diff(deaths))) %>%
  ungroup()
# Add a date_week variable holding the first day of every week. We use floor_date() from the lubridate package for this
# (round_date() would snap to the nearest week boundary instead of the start of the week).
# Then, we add the weekly sum of the daily deaths to every row of that week
data3 <- data3 %>%
  mutate(date_week = lubridate::floor_date(date, "week")) %>%
  group_by(administrative_area_level_1, date_week) %>%
  mutate(deaths_week = sum(deaths_daily)) %>%
  ungroup()
p <- ggplot(data3, aes(x=date_week, y=deaths_week/population*100000, col=administrative_area_level_1,
                       text=paste0("confirmed: ", confirmed, "\ndeaths: ", deaths))) +
  geom_line(size=1, aes(group=administrative_area_level_1)) +
  scale_x_date(date_breaks="1 month", date_labels="%B %Y") +
  theme_bw() + labs(title="Weekly deaths per 100.000", y="Weekly deaths per 100.000", x="") +
  theme(axis.text.x = element_text(angle=60, hjust=1))
ggplotly(p)
```